PSCI 2270 - Lecture 2
Department of Political Science, Vanderbilt University
September 2, 2024
Causal Theories
From Theory to Hypothesis
Operationalization of Theory
Learning about Population from Sample
Descriptive: Summarize data, investigate facts, discover hidden patterns
Predictive: Forecast events based on co-occurrence with other events/factors
Causal: Answer what-if questions
Correlation: any statistical association, though the term commonly refers to the degree to which a pair of variables is linearly related.
Causation: one event is the result of the occurrence of another; i.e. there is a causal relationship between the two events. This is also referred to as cause and effect.
Confounder: (also confounding variable, omitted variable, or lurking variable) a variable that influences both the independent variable and the dependent variable, producing a spurious association between them.
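A small simulation can make the confounder definition concrete. The sketch below is illustrative only: the variable names, the seed, the sample size, and the data-generating process (a common cause z driving both x and y, with no effect of x on y) are all assumptions, not part of the lecture. It compares the raw correlation between x and y with the correlation that remains after the linear influence of z is removed from both.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, zs):
    """Residuals from a simple least-squares regression of ys on zs."""
    mz, my = statistics.fmean(zs), statistics.fmean(ys)
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]

# Hypothetical data-generating process: z causes both x and y,
# and x has NO causal effect on y.
rng = random.Random(2270)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

raw_corr = pearson(x, y)                                # clearly positive
adj_corr = pearson(residuals(x, z), residuals(y, z))    # close to zero

print(f"correlation of x and y:              {raw_corr:.2f}")
print(f"correlation after adjusting for z:   {adj_corr:.2f}")
```

In this setup the raw association is entirely spurious: once z is accounted for, almost nothing is left. With real data the confounder is often unmeasured, so this adjustment is not always available.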
Suppose there are two factors that we know are positively correlated with each other (e.g. when \(X\) is higher, \(Y\) tends to be higher too).
\(X\) usually refers to the independent (explanatory) variable and \(Y\) to the dependent (outcome) variable.
Claim: Internet searches for the word “flu” increase the incidence of flu in the city from which the search arose.
Evidence: The more people in a city who do a Google search for the word “flu”, the more cases of flu there tend to be in that city.
Claim: Smoking causes lung cancer.
Evidence: People who smoke are more likely to contract lung cancer than people who don’t smoke.
Claim: Cell phone access increases violent protest.
Evidence: When a region in Africa gets cell phone coverage, the frequency of violent political protests in the next year goes up.
Claim: Experience of civil war causes people to develop more violent personalities.
Evidence: The more years of civil war a country has experienced since 1945, the more yellow and red cards its nationals get in club and international soccer matches.
Claim: Giving a student high grades causes them to perform better on standardized tests.
Evidence: Teenagers with higher high school GPAs get better scores on their SATs.
Claim: Super Bowl appearances are bad for health in the team’s home town.
Evidence: If you live in a town whose team makes it to the Super Bowl, you are more likely to die from the flu in that year.
A DAG displays assumptions about the relationship between variables (nodes).
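One way to make a DAG's assumptions explicit is to write it down as data. The sketch below is a minimal illustration, not the lecture's own notation: the three-node graph (Z, X, Y), the `parents` mapping, and the helper names are all hypothetical. Each node maps to the set of nodes with an arrow pointing into it; the code checks that the graph has no cycles and lists the nodes that are upstream of both X and Y, which are candidate confounders.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical DAG: Z -> X, Z -> Y, X -> Y.
# Each key is a node; each value is the set of its direct causes (parents).
parents = {
    "Z": set(),
    "X": {"Z"},
    "Y": {"X", "Z"},
}

def is_acyclic(parents):
    """Return True if the graph has no directed cycles."""
    try:
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

def ancestors(parents, node):
    """Return every node with a directed path into `node`."""
    seen = set()
    stack = list(parents.get(node, ()))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, ()))
    return seen

# Crude check: nodes upstream of both X and Y are candidate confounders.
common_causes = ancestors(parents, "X") & ancestors(parents, "Y")

print("acyclic:", is_acyclic(parents))
print("candidate confounders of X and Y:", common_causes)
```

The intersection of ancestors is only a rough screen. A node that reaches Y solely through X would also be flagged, so each candidate still has to be judged against the theory the DAG encodes.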
Do NOT do THIS! 😵
Theories are made up of concepts (nodes):
Concepts are latent:
Indicators are concrete:
Important to consider how we construct indicators
Sometimes there is slippage between the latent concept and its proxy, e.g.
Important to make measurement as unobtrusive as possible
Operational definition (Indicator):
Reliability:
Validity:
Question: “How much say do you have in getting the government to address issues that interest you?”
Problem?
Solution: Try to anchor responses with vignettes with different levels of “objective” efficacy and ask the
“Objective” ranking: Alison \(>\) Jane \(>\) Moses
Place respondent on the scale
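Placing a respondent on the scale can be sketched as a recoding of the self-rating relative to that respondent's own vignette ratings. The code below is one possible nonparametric recoding and may differ from the procedure used in the lecture. It assumes that higher numbers mean more say, that vignettes are passed in order from lowest to highest "objective" efficacy (Moses, Jane, Alison), and that respondents who rank the vignettes inconsistently are set aside. The function name and the example ratings are hypothetical.

```python
def relative_position(self_rating, vignette_ratings):
    """Recode a self-rating relative to the respondent's vignette ratings.

    vignette_ratings must be ordered from lowest to highest "objective"
    efficacy. With J vignettes the result runs from 1 (below all vignettes)
    to 2J + 1 (above all vignettes). Returns None if the respondent does not
    rank the vignettes in strictly increasing order (a simplifying assumption).
    """
    if any(a >= b for a, b in zip(vignette_ratings, vignette_ratings[1:])):
        return None
    position = 1
    for v in vignette_ratings:
        if self_rating < v:
            return position        # strictly below this vignette
        if self_rating == v:
            return position + 1    # tied with this vignette
        position += 2              # strictly above; move past it
    return position                # above every vignette

# Two hypothetical respondents who both answer 3 on a 1-5 scale,
# but who use the scale differently (ratings for Moses, Jane, Alison).
respondent_a = relative_position(3, [1, 2, 4])
respondent_b = relative_position(3, [3, 4, 5])

print("respondent A:", respondent_a)
print("respondent B:", respondent_b)
```

The same raw answer lands in different places once each respondent's use of the scale is taken into account: A sits between Jane and Alison, while B is tied with Moses.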
Question: “Do you agree or disagree with the following statement: Men are better leaders than women?”
Problem?
A sample across the Arab world was asked about equality:
Combine this information:
Every concept requires a unit of analysis
Many concepts can be measured at multiple levels, e.g. if we want to measure wealth:
At the individual level: Income? From wages? From capital gains? Assets? Consumer products? Calories consumed?
At the country level: GDP? GDP/capita? Energy consumption? Infant mortality rate?
Learning about Population from Sample
Descriptive Statistics
Types of Data Collection